AI Forensics

Final symposium, “AI and the Society of the Future”, VolkswagenStiftung

Patrick Riechert

HfG Karlsruhe

Giulia Gandolfi

HfG Karlsruhe

23 April 2025

AI Forensics: Accountability through interpretability in visual AI systems.

Staatliche Hochschule für Gestaltung Karlsruhe

Künstliche Intelligenz und Medienphilosophie
Prof. Matteo Pasquinelli

Universität Kassel

Dept. of Participatory IT Design
Prof. Claude Draude

Durham University

Dept. of Computer Science
Prof. Noura Al Moubayed

Cambridge University

Cambridge Digital Humanities
Dr. Leonardo Impett

University of California, Santa Barbara

Center for the Humanities and Machine Learning
Dr. Fabian Offert


Figure 1

Outline:

  • the original plan: motivation and structure
  • confluence of complications and a crisis of concept
  • research trajectories, strategies, results
  • re/discovering the “soul of the project”
  • outputs ahead & a look back

Original formulation

Project aim

enable public accountability through interpretability

Technical investigations rarely lead to practical, interactive instruments that favour accountability; conversely, critical studies of accountability still lack tools for a technical analysis of the AI black box.

This also implies bridging gaps between

  • academic disciplines (critical AI studies; explainable AI/ML)
  • societal sectors (research, activism, art)

Accountability through interpretability

  • developing, and demonstrating, a “generalized sociotechnical methodology” commensurate with AI’s effects at a societal level
  • dogfooding the creation of software: a platform of tools enabling an integrated, “full-stack” analysis, accessible also to researchers new to AI analysis

Figure 2: AI Forensics as a synthesis of methodologies

Project dimensions

Integrated methodologies/approaches:

  • Sociohistorical
  • Technical
  • Participatory design

Disassembling and deobfuscating the ‘AI production pipeline’

  • Dataset
  • Model
  • Application

Two pillars/output orientations

  • Sociotechnical case studies
  • Forensics toolkit

Project structure

Sociotechnical case studies

  1. Exposing the production pipeline of visual AI systems
  2. AI Interpretability and accountability in the humanitarian sector
  3. AI design interventions for social diversity
  4. Interpretability and accountability of visual AI systems in the sciences

Toolkit components

  • Data provenance tool
  • Model analysis tool
  • Adversarial module

Toolkit mockup

Mockup: an integrated interface combining the three components above while also surfacing e.g. contextual sociohistorical/epistemic aspects; the adversarial module covers e.g. the testing of adversarial examples/patches (a minimal sketch follows below).
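
For illustration, a minimal sketch of the kind of probe the adversarial module is meant to support, here the Fast Gradient Sign Method (FGSM) for crafting adversarial examples; the function and parameters are illustrative, not the toolkit’s actual API:

import torch
import torch.nn.functional as F

def fgsm_example(model, x, label, epsilon=0.03):
    """One-step FGSM: perturb input x in the gradient-sign direction
    that increases the classification loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()   # small, bounded perturbation
    return x_adv.clamp(0, 1).detach()     # keep pixels in valid range

A model that flips its prediction under such a visually negligible perturbation gives the analyst direct evidence about the instability of its decision boundary.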

Work plan

gantt
    dateFormat YY-MM
    axisFormat %Y W%V
    todayMarker off

    section Project coordination
   %% Application submission  :milestone, submitted, 2021-11-24, 1d
    Project Start :milestone, startv, 22-05, 1d
    Hiring Soc. PhDs : hiresocs, after startv , 60d
    Hiring Code PhDs : hirecode, after startv , 121d

    section Methodology design
    Inaugural workshop :crit, ws1 , after hiresocs , 1d
    Sociotech. case study research design :scsrd, after ws1 , 183d
    Second workshop :crit, ws2 , after hirecode , 61d

    section Implement method—Data 
    Dataset toolkit : datatk , after hirecode , 244d
    Sociotech. case study data coll. :scsdg, after scsrd , 61d
    Sociotech. case study data & model forensics :crit , scsdatamodforensics , after scsdg , 61d
    
    section Implement/co-develop Model analysis
    Model forensics dev. : modeltk , after scsdg , 427d
    Sociotech. case study model forensics : scsmodfr , after scsdg , 122d
    Third workshop :crit, ws3 , after scsmodfr , 61d

    section Adversarial dev.
    Application forensics dev. :apptk , after scsmodfr , 397d
    Sociotech. case study model/appl. forensics : scsmodapplforensics , after ws3 , 61d
    Fourth workshop :crit, ws4 , after scsmodapplforensics , 61d

    section Sociotechnical Case-studies scholarly side
    Sociotech. case studies read-write : scsrw , after hiresocs , until integr

    section Integration
    Integration of toolkit and case studies :integr , after scsmodapplforensics , 365d

Figure 3: Original work plan implied intricate causal dependencies

Complications

The factors

Internal:

Organisational ‘entropy’

  • Shifting institutional constellation throughout
  • Hiring delays plus systemic rigidities
  • Temporal desynchronisation along three dimensions: 27 months between projects; four academic calendars; a 9-hour time difference

External:

Project begins 2022

  • pre-/post-“foundation model” worlds
  • discursive & technical moment:
    • transformer hegemony, language models, instruct/chat tuning
    • societal proliferation—incl. education, academia

gantt
    title Early timeline: AI Forensics and AI
    dateFormat  YYYY-MM-DD

    section Project-meta-news
    Draft proposal  :2021-06-15 , 1d
    Submission  :milestone, submitted, 2021-11-24, 0d
    Start :milestone, start2, 2022-11-15, 0d

    section AI Developments
    Instruct-tuning   :milestone, instruct, 2022-03-04, 0d
    Stable diffusion  :milestone, sd, 2022-09-01, 0d
    ChatGPT released  :milestone, chat, 2022-12-05, 0d
    LLaMA leaks       :milestone, lleak, 2023-03-03, 0d
    GPT-4             :milestone, gpt4, 2023-03-14, 0d
    %% Timnit Gebru saga :

    section Case studies
    CS1a AI Regimes neural wiring visuality   :2023-03-01 , 2024-05-05
    CS2 Humanitarian AI    :2022-05-01 , 2024-05-05
    CS3 Design Social Diversity    :2022-11-01 , 2024-05-05
    CS4 AI in Science    :2023-01-31 , 2024-05-05

    section Forensics toolkit
    FTK-Data :2022-05-01 , 2023-05-01
    FTK-Data-postterm :done, 2023-05-01 , 2023-07-31
    FTK-Model :2023-03-01 , 2024-05-05
    %% FTK-App :2024-03-01 , 2024-05-05

    section MeetingsINT
    WS-sched-try1  :done, 2023-01-01  , 30d
    WS-sched-try2  :done, 2023-02-01  , 30d
    IRLmeeting     :2024-05-05 , 0d

The transformer paradigm

The foundation model/transformer/LLM era redrew the AI map at a societal level, and thus bore directly on the normative impetus of the project:

  • Architectural homogenisation: transformer/attention paradigm
  • Generative AI’s sudden affinity with language and culture

… and the toolkit’s practicability: scale and access

Dataset
  • ≈ Internet-scale
  • Platformisation, data-extractivism
Model
  • API-gating
  • Size, parameter count
Application
  • Vast proliferation

Morale

  • Haphazard momentum at the beginning (staffing, bureaucracy, planning overhead)…

  • …coincided with the discovery that a sizeable part of the common project was being rendered obsolete and that its assumptions had to be rethought…

  • and fed into a minor crisis of project coherence, including renaming discussions

…in retrospect: actually a productive time

What does interpreting AI mean?

AI Interpretability and Accountability in the Humanitarian Sector – Arif Kornweitz (PhD Candidate)

A Pedagogy of Machines: Technology in Education and Universities in Translation – Paolo Caffoni (PhD Candidate)

Matteo Pasquinelli (Supervisor) – KIM Research Group, HfG Karlsruhe

  • The technical idea of “mechanistic interpretability” is at stake, quite different from the notion of interpretation in the humanities, social sciences, or cultural studies
  • A detour/return to the study of art and culture, e.g. (Pasquinelli and Kornweitz 2023; Kornweitz 2023) as well as (Impett 2023) and (Offert 2023)
  • Integration with the KIM colloquium and seminars of summer 2023 turned into a strong exploration/discussion asset
  • Shift to semiotics re: AI’s “linguistic turn”—nuancing common views on AI as ‘neo-structuralism’

Explainability beyond explainability

AI Design Interventions for Social Diversity – Goda Klumbytė (PhD Candidate) and Claude Draude (Supervisor), Participatory IT Design Group, Universität Kassel

  • How can explainability/interpretability/understandability be addressed through design- and interaction-related concepts?
    • Feminist epistemologies highlighting pluriversal perspectives and the situatedness of knowledge (Klumbyte, Piehl, and Draude 2023b, 2023a)
    • Mapping the ‘design diagrams’—e.g. the organisation of the production processes creating AI models
  • Explainability and understandability not merely as properties of an AI system, but as emerging in interaction—pointing to roles of
    • metaphors (e.g. focus on transparency as a pre-requisite)
    • sociocultural factors (e.g. uneven distribution of power and resources)
    • interaction modalities (e.g. embodiment)

Explainability, embodiment, and tangibility

🖇 Leonardo Angelini et al., ‘Tangible LLMs: Tangible Sense-Making For Trustworthy Large Language Models’, in Proceedings of the Nineteenth International Conference on Tangible, Embedded, and Embodied Interaction, TEI ’25 (ACM, 2025) https://doi.org/10.1145/3689050.3708338.

  • Tangible interaction modalities for core LLM concepts such as input/output embeddings, positional encoding and latent space, attention, and temperature
  • Example prototype: certainty/temperature via pressure: “squeeze the user’s hand when the LLM is uncertain” {…}; the user’s response (pressure/grip) would then control the temperature parameter (a sketch follows below)
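
A minimal sketch of pressure-controlled temperature sampling, assuming a simple linear mapping from grip pressure to temperature; the mapping, bounds, and names are illustrative, not taken from the paper:

import numpy as np

def sample_with_grip(logits, grip_pressure, t_min=0.2, t_max=1.5):
    """Map grip pressure in [0, 1] to a sampling temperature: a firmer
    grip lowers the temperature and makes sampling more conservative.
    The linear mapping and its bounds are hypothetical."""
    temperature = t_max - grip_pressure * (t_max - t_min)
    z = logits / temperature
    z = z - z.max()                        # for numerical stability
    probs = np.exp(z) / np.exp(z).sum()    # softmax at that temperature
    return np.random.choice(len(logits), p=probs)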

Materials produced during participatory design workshop on embodied explanations

Models of?

Interpretability and Accountability of AI Systems in the Sciences – Fabian Offert, Paul Kim, and Qiaoyu Cai – UCSB

Scopic Regimes of Neural Wiring – Leonardo Impett – U. Cambridge

📄 Fabian Offert, ‘On the Concept of History (in Foundation Models)’, IMAGE 37 (22 May 2023): 121–34, https://doi.org/10.1453/1614-0885-1-2023-15462.

📃 Fabian Offert, Paul Kim, and Qiaoyu Cai, ‘Synthesizing Proteins on the Graphics Card. Protein Folding and the Limits of Critical AI Studies’ (arXiv, 7 December 2024) https://doi.org/10.48550/arXiv.2405.09788.

📘 Fabian Offert and Leonardo Impett, Vector knowledge (forthcoming).

  • Protein folding, the genetic code, and the pervasion of the biosemiotic language paradigm in biology
  • What do large transformer models actually model? Are they producing new knowledge?
  • Language as “the special case” for transformers; reasoning about deep structure in sequences (Offert, Kim, and Cai 2024)

Pedagogies of/with/after AI?

  • Cambridge – Dare – Generative AI UnToolkit
  • Karlsruhe – Caffoni – A Pedagogy of Machines
  • Kassel – Klumbytė/Draude – useXAI: The Use and Explainability Needs of Generative AI Tools among Students (led by the Participatory IT Design group in Kassel, in collaboration with the Faculty of Social Sciences and Human Sciences)

Medical AI

AI Interpretability and Linear Cause-Effect Models in Medicine: Is Non-Linear Diagnosis Possible? – Giulia Gandolfi (Postdoc) – Matteo Pasquinelli (Supervisor), KIM Research Group, HfG Karlsruhe

Medical diagnoses based on machine learning tend to reinforce traditional linear-causal explanatory models, despite their inherently correlative nature. In other words, although AI outputs are fundamentally correlational, human experts often interpret them as causal relationships due to deeply rooted social and medical paradigms.

Latent mechanistic interpretability – Patrick Leask (PhD student) and Noura Al Moubayed (Supervisor), Durham University

📄 Patrick Leask et al., ‘Sparse Autoencoders Do Not Find Canonical Units of Analysis’ (International Conference on Learning Representations, Singapore, 2025), repository

⚙️ Bart Bussmann, Patrick Leask, and Neel Nanda, ‘BatchTopK Sparse Autoencoders’, in NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning (Vancouver, 2024), https://openreview.net/forum?id=d4dpOCqybL.

🧷 Patrick Leask et al., ‘Stitching Sparse Autoencoders of Different Sizes’, in NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning (Vancouver, 2024), https://openreview.net/forum?id=VJ66JyKxgp.

Figure 4: Meta SAE Dashboard for feature exploration
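
A minimal sketch of the BatchTopK idea (Bussmann, Leask, and Nanda 2024): instead of keeping the top-k latent activations per sample, keep the top k × batch_size activations across the whole batch, so per-sample sparsity can vary. Dimensions and training details are omitted; names are illustrative:

import torch
import torch.nn as nn

class BatchTopKSAE(nn.Module):
    """Sketch of a BatchTopK sparse autoencoder: the sparsity budget is
    enforced over the whole batch instead of per sample."""
    def __init__(self, d_model, d_latent, k):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, d_model)
        self.k = k  # average number of active latents per sample

    def forward(self, x):                    # x: (batch, d_model)
        acts = torch.relu(self.encoder(x))   # latent activations
        # Batch-level threshold: keep the top k * batch_size activations,
        # so individual samples may use more or fewer than k latents.
        threshold = acts.flatten().topk(self.k * x.shape[0]).values.min()
        acts = acts * (acts >= threshold)
        return self.decoder(acts), acts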

Rediscovering the ‘soul of the project’

or: “the importance of meeting in person”

the friction between the social and the technical

sensitivity to how ways of thinking about social forms find their way into technical forms and have a sort of Nachleben (afterlife)

  • affinity/proximity to the notion of boundary object (Star 1989, 2010; Star and Ruhleder 1996)

  • understanding of these systems as epistemic machines

  • Glossary: Interpreting machine learning: a sociotechnical glossary

Outputs

Collective publication

📘 Matteo Pasquinelli, Claude Draude, Noura Al Moubayed, Leonardo Impett, and Fabian Offert (Eds.), Interpreting machine learning: a sociotechnical glossary (forthcoming/2025).

  • organised around boundary concepts
  • experimenting with presentation in latent space / with interpretability interfaces (a sketch of the idea follows below)
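
A minimal sketch of what “presentation in latent space” could mean in practice, assuming an off-the-shelf sentence encoder and a 2-D projection; the entries and model choice are placeholders, not the forthcoming glossary’s actual interface:

from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA

# Hypothetical glossary entries; the real glossary is forthcoming.
entries = ["boundary object", "mechanistic interpretability",
           "latent space", "explainability", "epistemic machine"]

# Embed each entry, then project to 2-D so the glossary can be laid
# out by semantic proximity rather than alphabetically.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
coords = PCA(n_components=2).fit_transform(encoder.encode(entries))

for term, (px, py) in zip(entries, coords):
    print(f"{term:30s} -> ({px:+.2f}, {py:+.2f})")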

Collective event

🗓 1st week of July 2025

📍 Berlin — Stay tuned!

Reflections

  • ‘Full stack’ sociotechnical approach
  • Discontinuous architectural advances & (scale) trajectories
  • Discursive multipolarity

References

Impett, Leonardo. 2023. “A History of Machine Visuality,” April. https://fromhypetoreality.com/.
Klumbyte, Goda, Hannah Piehl, and Claude Draude. 2023a. “Feminist Epistemology for Machine Learning Systems Design.” https://doi.org/10.48550/ARXIV.2310.13721.
———. 2023b. “Towards Feminist Intersectional XAI: From Explainability to Response-Ability.” https://doi.org/10.48550/ARXIV.2305.03375.
Kornweitz, Arif. 2023. “AI en de regels van de kunst.” Metropolis M 2023 (November). https://metropolism.com/nl/feature/50950_ai_en_de_regels_van_de_kunst/.
Offert, Fabian. 2023. “What Are Large Visual Models Models Of?” April. https://fromhypetoreality.com/.
Offert, Fabian, Paul Kim, and Qiaoyu Cai. 2024. “Synthesizing Proteins on the Graphics Card. Protein Folding and the Limits of Critical AI Studies.” https://doi.org/10.48550/ARXIV.2405.09788.
Pasquinelli, Matteo, and Arif Kornweitz. 2023. “The Sound of Multidimensional Space: How Avant-Garde Music Foreshadowed Artificial Intelligence.” In Catalogue of the International Festival of Contemporary Music 67, 86–93. Venice: La Biennale di Venezia; NERO.
Star, Susan Leigh. 1989. “The Structure of Ill-Structured Solutions: Boundary Objects and Heterogeneous Distributed Problem Solving.” In Distributed Artificial Intelligence, Volume II, edited by Les Gasser and Michael N. Huhns, 37–54. San Francisco, CA: Morgan Kaufmann. https://www.sciencedirect.com/science/article/pii/B978155860092850006X.
———. 2010. “This Is Not a Boundary Object: Reflections on the Origin of a Concept.” Science, Technology, & Human Values 35 (5): 601–17. https://doi.org/10.1177/0162243910377624.
Star, Susan Leigh, and Karen Ruhleder. 1996. “Steps Toward an Ecology of Infrastructure: Design and Access for Large Information Spaces.” Information Systems Research 7 (1): 111–34. https://doi.org/10.1287/isre.7.1.111.